Event Extraction from Classical Arabic Texts
Abstract
Event extraction is one of the most useful and challenging Information Extraction (IE) tasks, with applications across natural language processing, in particular semantic search systems. Most existing systems extract events from English texts, so many other languages, Arabic in particular, still need research in this area. In this paper, we develop a system for extracting person-related events and their participants from classical Arabic texts with complex linguistic structure. The first and most influential step in event extraction is correctly detecting event mentions and identifying the sentences that describe events. Implementing several methods and comparing their performance can help researchers choose an appropriate event extraction method for their own conditions and limitations. In this research, we implemented three methods to automatically classify sentences as on-event or off-event: a knowledge-oriented method (based on a set of keywords and rules), a data-oriented method (based on a Support Vector Machine (SVM)), and a semantic-oriented method (based on lexical chains). The results indicate that the knowledge-oriented and machine learning methods achieve high precision and recall in the event extraction process. The semantic-oriented method, with acceptable precision, minimizes the linguistic knowledge requirements of the knowledge-oriented method and the preprocessing requirements of the data-oriented method, and also improves automatic event extraction from raw text. The next step is to develop a modular rule-based approach for extracting event arguments, such as time, place, and other participants, in independent subtasks.
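As a rough illustration of the knowledge-oriented idea (keywords plus rules), the following is a minimal sketch and not the system described in the paper. The trigger list, the clitic-stripping rule, and the function names are all hypothetical and chosen only to make the example concrete.

```python
import re

# Hypothetical trigger verbs (born, died, travelled, married).
# Illustrative only; this is NOT the keyword set used in the paper.
EVENT_TRIGGERS = {"ولد", "توفي", "سافر", "تزوج"}

# Arabic short-vowel marks (U+064B..U+0652), superscript alef, and tatweel.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")


def strip_diacritics(text: str) -> str:
    """Remove vocalization marks so vocalized and bare spellings compare equal."""
    return _DIACRITICS.sub("", text)


def is_on_event(sentence: str) -> bool:
    """Label a sentence on-event if any token matches a trigger keyword.

    Assumed rule: a leading conjunction clitic (wa- or fa-) may be
    dropped before the token is looked up in the keyword set.
    """
    # Strip diacritics first: combining marks would otherwise split tokens.
    tokens = re.findall(r"\w+", strip_diacritics(sentence))
    for token in tokens:
        if token in EVENT_TRIGGERS:
            return True
        if token[0] in "وف" and token[1:] in EVENT_TRIGGERS:
            return True
    return False
```

A real keyword-and-rule system would need a far richer lexicon and morphological analysis; this sketch only shows the shape of a sentence-level on-event/off-event decision.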
Similar resources
Mani’s Living Gospel: A New Approach to the Arabic and Classical New Persian Testimonia
In order to reconstruct the contents of the most famous work of Mani, Living Gospel (written originally in Syriac), we have to use the Arabic and Classical New Persian texts containing accounts and even indirect quotations of this book. One of the most remarkable points in these accounts is that they clearly show that an important part of the Living Gospel contains the Manicha...
Tashkeela: Novel corpus of Arabic vocalized texts, data for auto-diacritization systems
Arabic diacritics are often missing from Arabic script. This is a handicap for new learners reading Arabic, for text-to-speech conversion systems, and for reading and semantic analysis of Arabic texts. Automatic diacritization systems are the best solution to this issue, but such automation needs resources such as diacritized texts to train and evaluate these systems. In this paper, we describe ...
Person name recognition using injection of name-candidate words into conditional random fields for the Arabic language
Named Entity Recognition and Extraction are very important tasks for discovering proper names, including persons, locations, dates, and times, inside electronic textual resources. An accurate named entity recognition system is an essential utility for resolving fundamental problems in question answering systems, summary extraction, information retrieval and extraction, machine translation, video interpr...
Classifying and Segmenting Classical and Modern Standard Arabic using Minimum Cross-Entropy
Text classification is the process of assigning a text or a document to various predefined classes or categories to reflect their contents. With the rapid growth of Arabic text on the Web, studies that address the problems of classification and segmentation of the Arabic language are limited compared to other languages, most of which implement word-based and feature extraction algorithms. This ...
Using text mining to identify crime patterns from Arabic crime news report corpus
Most text mining techniques have been proposed only for English text, and even here, most research has been conducted on specific texts related to special contexts within the English language, such as politics, medicine and crime. In contrast, although Arabic is a widely spoken language, few mining tools have been developed to process Arabic text, and some Arabic domains have not been studied a...
Journal: Int. Arab J. Inf. Technol.
Volume: 12
Publication date: 2015